HierVST Voice Cloning | NVIDIA Perfusion | Meta's AudioCraft

Update: 2023-08-03
Description

Welcome to AI Daily! Join hosts Farb, Ethan, and Conner as they explore three groundbreaking AI stories. First up, HierVST Voice Cloning: zero-shot voice cloning with impressive accuracy from just one audio clip. Next, NVIDIA Perfusion: a small, powerful personalization model for text-to-image generation that uses key locking to keep subjects consistent. Lastly, Meta's AudioCraft: music generation, audio generation, and codecs fused into one open-source codebase that produces high-fidelity outputs.

Quick Points

1️⃣ HierVST Voice Cloning

* Zero-shot voice cloning system achieves accurate outputs with just one audio clip.

* Uses hierarchical models to capture both long-term and short-term structure during generation.

* May struggle with longer clips and may need further fine-tuning.

2️⃣ NVIDIA Perfusion

* Personalization model for text-to-image generation, with key locking for subject consistency.

* Only 100 kilobytes, trains in four minutes, and outperforms other models.

* Open-source codebase, but may need improvements for human subjects.

3️⃣ Meta’s AudioCraft

* Audio generation, music generation, and codecs combined into one open-source codebase.

* High-fidelity outputs, 30 seconds of sound, and efficient audio file compression.

* Meta is making strides in audio AI and, impressively, has opened the work to the community for research use.

🔗 Episode Links

* HierVST Voice Cloning

* NVIDIA Perfusion

* Meta's AudioCraft

* ChatGPT String Tweet

* Apple App Store/China Story

Connect With Us:

Follow us on Threads

Subscribe to our Substack

Follow us on Twitter:

* AI Daily

* Farb

* Ethan

* Conner



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.aidailypod.com